Small variants (SNV/INDEL) coverage and variant allele frequency (VAF) distribution
Mutational signatures
Mutational signatures are estimated using the R package MutationalPatterns, based on SNVs only (INDELs are ignored).
Notes on small variants (SNV/INDEL) filtering
Variants are filtered with any of the following criteria:
IMPACT is HIGH or MODERATE
CLIN_SIG contains pathogenic (Pathogenic intron variants will be retained)
CANCER_TYPE is not NA (variants that are in IntOGen Cancer Gene Census)
MAX_AF (maximum population allele frequency) is less than 3%
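The four criteria above can be sketched as a small check that reports which ones a variant meets. This is only an illustration, not the report's actual filtering code: the function name `criteria_met`, the dictionary layout, the use of `None` for NA, the case-insensitive substring match on CLIN_SIG, and treating a missing MAX_AF as "not met" are all assumptions.

```python
def criteria_met(variant: dict) -> list[str]:
    """Return the names of the listed criteria that a variant satisfies.

    `variant` is a hypothetical dict keyed by the column names used in
    this report; NA values are assumed to be represented as None.
    """
    met = []

    # IMPACT is HIGH or MODERATE
    if variant.get("IMPACT") in {"HIGH", "MODERATE"}:
        met.append("IMPACT")

    # CLIN_SIG contains "pathogenic" (assumed case-insensitive substring match)
    clin_sig = variant.get("CLIN_SIG")
    if clin_sig and "pathogenic" in clin_sig.lower():
        met.append("CLIN_SIG")

    # CANCER_TYPE is not NA
    if variant.get("CANCER_TYPE") is not None:
        met.append("CANCER_TYPE")

    # MAX_AF is less than 3% (a missing MAX_AF is assumed not to qualify)
    max_af = variant.get("MAX_AF")
    if max_af is not None and max_af < 0.03:
        met.append("MAX_AF")

    return met


# Invented example record, for illustration only.
example_variant = {
    "IMPACT": "LOW",
    "CLIN_SIG": "likely_pathogenic",
    "CANCER_TYPE": None,
    "MAX_AF": 0.001,
}
matched = criteria_met(example_variant)
```

Returning the list of matched criteria, rather than a single yes/no, makes it easy to see why a given variant appears in the table.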
CANCER_TYPE_ROLE and CANCER_TYPE_CGC_GENE are columns merged from CANCER_TYPE, ROLE and CGC_CANCER_GENE. Their values are collapsed into single entries separated by semicolons. E.g. CANCER_TYPE = “Breast;Prostate” and ROLE = “LoF;Act” means that the gene is a LoF in breast cancer and an Act in prostate cancer. This is done so that the table is more readable.
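To read a collapsed entry back as per-cancer-type roles, the two semicolon-separated strings can be split and paired by position. The sketch below assumes that the n-th cancer type always corresponds to the n-th role, as the Breast/Prostate example suggests; `pair_cancer_roles` is a hypothetical helper name, not part of the report's code.

```python
def pair_cancer_roles(cancer_type: str, role: str) -> dict[str, str]:
    """Pair each cancer type with the role at the same position.

    Assumes both strings are semicolon-separated and aligned by position.
    """
    cancers = cancer_type.split(";")
    roles = role.split(";")
    if len(cancers) != len(roles):
        raise ValueError("CANCER_TYPE and ROLE have different numbers of entries")
    return dict(zip(cancers, roles))


# The example from the note above.
pairs = pair_cancer_roles("Breast;Prostate", "LoF;Act")
# pairs maps "Breast" to "LoF" and "Prostate" to "Act"
```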
Small variants (SNV/INDEL) table
Notes on structural variants (SVs)
SVs are filtered to only those that are part of the IntOGen Cancer Gene Census (CGC)
Annotation is based on AnnotSV. However, to keep the output readable, some columns containing very long values (e.g. “_coord” and “_source”) are removed. Please refer to the original AnnotSV output for more information.
AnnotSV converts square-bracketed notation using the harmonization rule from variant-extractor, which may result in incorrect conversions, especially when converting BND to DEL.
Columns with capitalized names are from IntOGen CGC. Please see the README from the IntOGen release for more information.
CANCER_TYPE_ROLE and CANCER_TYPE_CGC_GENE are columns merged from CANCER_TYPE, ROLE and CGC_CANCER_GENE. Their values are collapsed into single entries separated by semicolons. E.g. CANCER_TYPE = “Breast;Prostate” and ROLE = “LoF;Act” means that the gene is a LoF in breast cancer and an Act in prostate cancer. This is done so that the table is more readable.
Each SV can affect multiple genes. AnnotSV “splits” the different genes into different entries. This is why there are multiple rows with the same AnnotSV_ID.
The ALT allele for insertions is shown as “Too long” in the table. Please refer to the original AnnotSV output for more information.
Note that Severus can call a duplication as a BND event, and AnnotSV tends to annotate these as DEL events because it does not make use of the “STRAND” information. The “SV_type” column is therefore not very accurate for BND events (you can recognize these by SEVERUS_BND in the ID column).
The “SAMPLE” column represents the FORMAT column in the VCF. For Severus this is “GT:GQ:VAF:hVAF:DR:DV”
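The last two notes can be illustrated with a short sketch: splitting a SAMPLE cell into named fields using the Severus key order, and flagging rows whose SV_type should be treated with caution. It assumes that the SAMPLE cell holds colon-separated values in the same order as the keys; the helper names, the sample value, and the example IDs are invented for illustration.

```python
SEVERUS_FORMAT = "GT:GQ:VAF:hVAF:DR:DV"


def parse_sample(sample_value: str, format_keys: str = SEVERUS_FORMAT) -> dict[str, str]:
    """Map each FORMAT key to the value at the same position in the SAMPLE cell.

    Values are kept as strings; no type conversion is attempted.
    """
    keys = format_keys.split(":")
    values = sample_value.split(":")
    if len(keys) != len(values):
        raise ValueError("SAMPLE value does not match the FORMAT keys")
    return dict(zip(keys, values))


def is_severus_bnd(variant_id: str) -> bool:
    """True if the ID marks a Severus BND call, whose SV_type may be unreliable."""
    return "SEVERUS_BND" in variant_id


# Hypothetical SAMPLE cell and IDs, not taken from real output.
fields = parse_sample("0/1:60:0.25:0.1,0.15,0.0:30:10")
flagged = is_severus_bnd("SEVERUS_BND42_1")
```

Keeping the values as strings avoids guessing at field types such as the multi-valued hVAF entry; convert individual fields only where needed.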